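Most classical SLAM systems rely on the static-scene assumption, which limits their applicability in real-world scenarios. Recent SLAM frameworks have been proposed to track the camera and moving objects simultaneously. However, they are often unable to estimate the canonical pose of the objects and exhibit low object tracking accuracy. To solve this problem, we propose TwistSLAM++, a semantic, dynamic SLAM system that fuses stereo images and LiDAR information. Using semantic information, we track potentially moving objects and associate them with 3D object detections in LiDAR scans to obtain their pose and size. We then register consecutive object scans to refine the object pose estimates. Finally, the object scans are used to estimate the shape of each object and to constrain map points to lie on the estimated surface within the bundle adjustment (BA). We show on classical benchmarks that this fusion approach, based on multimodal information, improves the accuracy of object tracking.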
Translated by Google Translate
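Classical visual simultaneous localization and mapping (SLAM) algorithms usually assume that the environment is rigid. This assumption limits the applicability of these algorithms, because they cannot accurately estimate the camera poses and world structure in real-life scenes containing moving objects (e.g. cars, bikes, pedestrians). To tackle this issue, we propose TwistSLAM: a semantic, dynamic, stereo SLAM system that can track dynamic objects in the environment. Our algorithm creates clusters of points according to their semantic class. Thanks to the definition of inter-cluster constraints modeled by mechanical joints (a function of the semantic class), a novel constrained bundle adjustment is then able to jointly estimate the poses and velocities of moving objects along with the classical world structure and camera trajectory. We evaluate our approach on several sequences from the public KITTI dataset and demonstrate quantitatively that it improves camera and object tracking compared to state-of-the-art approaches.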
This contribution demonstrates the feasibility of applying Generative Adversarial Networks (GANs) to images of EPAL pallet blocks for dataset enhancement in the context of re-identification. For many industrial applications of re-identification methods, datasets of sufficient volume would otherwise be unattainable in non-laboratory settings. Using a state-of-the-art GAN architecture, namely CycleGAN, images of pallet blocks rotated to their left-hand side were generated from images of visually centered pallet blocks, based on images of rotated pallet blocks taken from a previously published dataset. In this process, the unique chipwood pattern of the pallet block surface structure was retained, and only the orientation of the pallet block itself was changed. In this way, synthetic data for re-identification testing and training was generated in a manner that is distinct from ordinary data augmentation. In total, 1,004 new images of pallet blocks were generated. The quality of the generated images was gauged using a perspective classifier that was trained on the original images and then applied to the synthetic ones, comparing the accuracy between the two sets of images. The classification accuracy was 98% for the original images and 92% for the synthetic images. In addition, the generated images were used in a re-identification task, in which original images were re-identified based on synthetic ones. The accuracy in this scenario was up to 88% for synthetic images, compared to 96% for original images. This evaluation establishes whether a generated pallet block image closely resembles its original counterpart.
We leverage path differentiability and a recent result on nonsmooth implicit differentiation calculus to give sufficient conditions ensuring that the solution to a monotone inclusion problem will be path differentiable, with formulas for computing its generalized gradient. A direct consequence of our result is that these solutions happen to be differentiable almost everywhere. Our approach is fully compatible with automatic differentiation and comes with assumptions which are easy to check, roughly speaking: semialgebraicity and strong monotonicity. We illustrate the scope of our results by considering three fundamental composite problem settings: strongly convex problems, dual solutions to convex minimization problems and primal-dual solutions to min-max problems.
Recent studies have revealed that, beyond conventional accuracy, calibration should also be considered when training modern deep neural networks. To address miscalibration during learning, some methods have explored different penalty functions as part of the learning objective, alongside a standard classification loss, with a hyper-parameter controlling the relative contribution of each term. Nevertheless, these methods share two major drawbacks: 1) the scalar balancing weight is the same for all classes, which hinders the ability to address different intrinsic difficulties or imbalance among classes; and 2) the balancing weight is usually fixed without an adaptive strategy, which may prevent the model from reaching the best compromise between accuracy and calibration, and requires a hyper-parameter search for each application. We propose Class Adaptive Label Smoothing (CALS) for calibrating deep networks, which learns class-wise multipliers during training, yielding a powerful alternative to common label smoothing penalties. Our method builds on a general Augmented Lagrangian approach, a well-established technique in constrained optimization, but we introduce several modifications to tailor it for large-scale, class-adaptive training. Comprehensive evaluation and multiple comparisons on a variety of benchmarks, including standard and long-tailed image classification, semantic segmentation, and text classification, demonstrate the superiority of the proposed method. The code is available at https://github.com/by-liu/CALS.
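To make the contrast between a single scalar balancing weight and class-wise multipliers concrete, here is a minimal sketch in plain Python. It is not the CALS implementation: the function name `penalized_loss`, the choice of a uniform-target (label-smoothing-style) penalty, and the fixed weights are all assumptions made for illustration, and the Augmented Lagrangian update that would learn the multipliers is not shown.

```python
import math
from typing import List


def log_softmax(logits: List[float]) -> List[float]:
    """Numerically stable log-softmax over one logit vector."""
    m = max(logits)
    lse = m + math.log(sum(math.exp(z - m) for z in logits))
    return [z - lse for z in logits]


def penalized_loss(logits: List[float], target: int, weights: List[float]) -> float:
    """Cross-entropy plus a weighted label-smoothing-style penalty.

    `weights` holds one non-negative multiplier per class (an assumption for
    this sketch). Giving every class the same value reproduces the usual
    scalar balancing weight; distinct values give class-wise weighting.
    """
    logp = log_softmax(logits)
    k = len(logits)
    cross_entropy = -logp[target]
    # Penalty: per-class share of the cross-entropy against a uniform target.
    penalty = sum(w * (-lp / k) for w, lp in zip(weights, logp))
    return cross_entropy + penalty


logits = [2.0, 0.5, -1.0]
scalar_weighted = penalized_loss(logits, target=0, weights=[0.1, 0.1, 0.1])
class_weighted = penalized_loss(logits, target=0, weights=[0.05, 0.1, 0.3])
```

With equal weights the penalty treats every class identically, which is the first drawback named above; the class-wise variant lets a rare or difficult class carry a different multiplier.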
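Blind source separation (BSS) algorithms are unsupervised methods that form the cornerstone of hyperspectral data analysis, because they allow physically meaningful data decompositions. Since BSS problems are ill-posed, solving them requires efficient regularization schemes to better distinguish between the sources and to yield interpretable solutions. For that purpose, we investigate a semi-supervised source separation approach in which we combine a projected alternating least-squares algorithm with a learning-based regularization scheme. In this article, we focus on constraining the mixing matrix to belong to a learned manifold by making use of generative models. Altogether, we show that this yields an innovative BSS algorithm with improved accuracy that provides physically interpretable solutions. The method is tested on realistic hyperspectral astrophysical data in challenging scenarios involving strong noise, highly correlated spectra, and unbalanced sources. The results highlight the significant benefit of the learned prior in reducing leakage between the sources, which allows an overall better disentanglement.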
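The general shape of a projected alternating least-squares loop can be sketched as follows, assuming the linear mixture model X ≈ A S with mixing matrix A and sources S. This is an illustrative stand-in rather than the method of the abstract: the function names are hypothetical, and the projection of A onto a learned manifold is replaced here by a simple non-negativity and unit-norm projection.

```python
import numpy as np


def project_mixing_matrix(A: np.ndarray) -> np.ndarray:
    """Stand-in projection: non-negative columns rescaled to unit norm.

    In the approach described above, this step would instead map A onto a
    manifold learned by a generative model.
    """
    A = np.maximum(A, 0.0)
    norms = np.linalg.norm(A, axis=0)
    return A / np.maximum(norms, 1e-12)


def projected_als(X: np.ndarray, n_sources: int, n_iter: int = 50, seed: int = 0):
    """Alternate least-squares updates of A and S, projecting after each one."""
    rng = np.random.default_rng(seed)
    S = np.abs(rng.standard_normal((n_sources, X.shape[1])))
    A = np.zeros((X.shape[0], n_sources))
    for _ in range(n_iter):
        # Least-squares update of the mixing matrix, then projection.
        A = project_mixing_matrix(X @ np.linalg.pinv(S))
        # Least-squares update of the sources, then non-negativity projection.
        S = np.maximum(np.linalg.pinv(A) @ X, 0.0)
    return A, S


# Synthetic example: 5 observation channels, 2 sources, 30 samples.
rng = np.random.default_rng(1)
A_true = np.abs(rng.standard_normal((5, 2)))
S_true = np.abs(rng.standard_normal((2, 30)))
A_est, S_est = projected_als(A_true @ S_true, n_sources=2)
```

Keeping the projection in its own function makes the regularizer easy to swap: only `project_mixing_matrix` would change if a learned prior replaced the hand-written constraint.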
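This paper considers the task of learning users' preferences over a combinatorial set of alternatives, as typically used by online configurators. In many settings, the learner only has access to the set of alternatives selected during past interactions. Fargier et al. [2018] propose an approach for learning, in such a setting, a model of the users' preferences that ranks previously chosen alternatives as high as possible, together with an algorithm for learning, in this setting, a particular preference model: lexicographic preference trees (LP-trees). In this paper, we study complexity-theoretic problems related to this approach. We give an upper bound on the sample complexity of learning an LP-tree, which is logarithmic in the number of attributes. We also prove that computing the LP-tree that minimizes the empirical risk can be done in polynomial time when restricted to the class of linear LP-trees.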
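For readers unfamiliar with lexicographic preferences, the following minimal sketch shows how a linear LP-tree with unconditional value preferences orders alternatives. The data representation, the function names, and the use of the mean rank of the chosen alternatives as a score are assumptions made for illustration; the abstract does not give the exact definition of the empirical risk.

```python
from typing import Dict, List, Tuple

# Hypothetical representation of a linear LP-tree with unconditional
# preferences: attributes in decreasing importance, each paired with its
# values listed from most to least preferred.
LinearLPTree = List[Tuple[str, List[str]]]
Alternative = Dict[str, str]


def prefers(tree: LinearLPTree, a: Alternative, b: Alternative) -> bool:
    """True if `a` is strictly preferred to `b`.

    The most important attribute on which the two alternatives differ decides.
    """
    for attribute, value_order in tree:
        if a[attribute] != b[attribute]:
            return value_order.index(a[attribute]) < value_order.index(b[attribute])
    return False  # identical alternatives: no strict preference


def rank(tree: LinearLPTree, alt: Alternative) -> int:
    """0-based position of `alt` in the induced total order (0 is best)."""
    position = 0
    for attribute, value_order in tree:
        # Mixed-radix encoding: more important attributes are higher digits.
        position = position * len(value_order) + value_order.index(alt[attribute])
    return position


def mean_rank(tree: LinearLPTree, sample: List[Alternative]) -> float:
    """Average rank of the chosen alternatives (illustrative score, lower is better)."""
    return sum(rank(tree, alt) for alt in sample) / len(sample)


tree: LinearLPTree = [("color", ["red", "blue"]), ("size", ["large", "small", "medium"])]
chosen = [{"color": "red", "size": "large"}, {"color": "blue", "size": "small"}]
score = mean_rank(tree, chosen)
```

A tree that fits the observed choices well places them near the top of the order, so comparing `mean_rank` across candidate trees gives a simple way to see what ranking chosen alternatives "as high as possible" means.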
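We consider the task of detecting anomalies for autonomous mobile robots based on vision. We categorize the relevant types of visual anomalies and discuss how they can be detected by unsupervised deep learning methods. We propose a novel dataset built specifically for this task, on which we test a state-of-the-art approach. Finally, we discuss deployment in a real scenario.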
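We consider the problem of building visual anomaly detection systems for mobile robots. Standard anomaly detection models are trained on large datasets composed only of non-anomalous data. However, in robotics applications, (potentially very few) examples of anomalies are often available. We tackle the problem of exploiting these data to improve the performance of a Real-NVP anomaly detection model by minimizing an auxiliary outlier loss jointly with the Real-NVP loss. We perform quantitative experiments on a novel dataset (provided as supplementary material) designed for anomaly detection in an indoor patrolling scenario. On a disjoint test set, our approach outperforms the alternatives and shows that even a small number of anomalous frames yields significant performance improvements.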
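Deep learning (DL) techniques have been embraced for regression problems. The growing number of papers recently published in this field, including surveys and reviews, shows that deep regression has captured the attention of the community because of its efficiency and good accuracy in systems with high-dimensional data. However, many DL methods have complex structures that are not readily transparent to human users. Assessing the interpretability of these models is an essential factor for addressing problems in sensitive areas such as cyber-security systems, medicine, financial surveillance, and industrial processes. Fuzzy logic systems (FLS) are interpretable models, well known in the literature, that can represent complex systems nonlinearly through linguistic terms with membership degrees, mimicking human thought. In the current climate of explainable artificial intelligence, the trade-off between accuracy and interpretability must be considered when developing intelligent models. This paper surveys the state of the art in existing methods that combine DL and FLS, namely deep fuzzy systems, to address regression problems, a topic that is not yet sufficiently explored in the literature and therefore deserves a comprehensive survey.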